We formulate the problem of probabilistic predictions of global failure in the simplest possible model based 
on site percolation and on one of the simplest models of time-dependent rupture, a hierarchical fiber bundle 
model. We show that conditioning the predictions on the knowledge of the current degree of damage (occupancy 
density p or number and size of cracks) and on some information on the largest cluster improves significantly 
the prediction accuracy, in particular by making it possible to identify those realizations which have anomalously small or 
large clusters (cracks). We quantify the prediction gains using two measures: the relative specific information 
gain (which is the variation of entropy obtained by adding new information) and the root-mean-square of the 
prediction errors over a large ensemble of realizations. The bulk of our simulations have been obtained with 
the two-dimensional site percolation model on a lattice of size L x L = 20 x 20, and the results hold true for other 
lattice sizes. For the hierarchical fiber bundle model, conditioning the measures of damage on the information 
of the location and size of the largest crack extends significantly the critical region and the prediction skills. 
These examples illustrate how on-going damage can be used as a revelation of both the realization-dependent 
pre-existing heterogeneity and the damage scenario undertaken by each specific sample. 

PACS numbers: 62.20.Mk; 61.43.-j; 91.30.Px 



I. INTRODUCTION 

The idea underlying this paper was inspired by the
method of "reverse tracing of precursors" (RTP) introduced
in Refs. [1, 2] as a method of earthquake prediction
based on seismicity patterns. In a nutshell, the
RTP method consists first in delineating a spatial domain
S(t) by using a space-time correlation analysis of
past seismicity up to the present time t and then in
constructing precursory diagnostics based on past seismicity
restricted to this spatial domain S(t) (called chains
in Refs. [1, 2]). In Refs. [1, 2], the precursory functions
used to issue a prediction are based on previously
documented seismic anomalies (see [3] and references
therein) and will not be our concern. Rather, the question
we are asking is: what could justify the innovation
presented in Refs. [1, 2] of constraining the construction
of precursory diagnostics to some special spatial domains
recognized from some spatio-temporal correlation
analysis of past seismicity? Indeed, Refs. [1, 2] do
not explain why their method should work or what
its underlying physical mechanism(s) could be,
since their approach is based on the pragmatic
mathematical pattern recognition method initiated long
ago by Gelfand et al. [4]. Our paper is the first one in a
series which shows how the idea behind the RTP can be



'Electronic address: vitting@unice.fr, soraette@moho.ess.ucla.edu 



actually justified on physical grounds and used for im- 
proving previous prediction methods for earthquakes 
or material ruptures. 

We first present the problem and explore its implica- 
tions for the percolation model and then test the robust- 
ness of the results and extend them to a time-dependent 
hierarchical fiber bundle model. 



II. FORMULATION OF THE PERCOLATION 
MODEL 

As a first step, we propose to formulate the problem 
with perhaps the simplest model of heterogeneous media
undergoing a transition, the site percolation model
01001 By doing so, we aim at capturing the essence 
of the idea. 

Consider a two-dimensional lattice of L x L sites 
which are initially empty. We then fill one by one 
the sites at random positions and denote by p the
corresponding fraction of occupied sites. Any given
realization Cl will be characterized by some threshold
p_c(Cl) at which the occupied sites form a cluster
which barely percolates from one side of the system
to its opposite side. It is known that, for L -> infinity,
p_c(Cl) becomes independent of the specific realization
of the system and converges to a unique number
pf = 0.5927460 ± 0.0000005 fl. It is also well- 
known that, for finite L, p c (Cl) is a random number 
distributed according to a probability density function
(PDF) P(p_c) centered on a value shifted downwards
from the infinite-lattice threshold p_c^∞ by an amount
and with a width which are both proportional to
1/L^(1/ν), where ν = 4/3 is the
universal exponent (in two dimensions) of the correlation
length, defined roughly speaking as the typical
size of the largest cluster. The shift and width of the
PDF P(p_c) are characteristic of the so-called finite-size
scaling of the critical percolation transition.
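
As a rough worked illustration of the finite-size scaling just described, the shift and width can be written as below. This is a sketch only: p_c^∞ denotes the L -> infinity threshold, the center of the PDF is taken to be the mean ⟨p_c⟩_L (an assumption), and the amplitudes a and b are unspecified positive constants introduced here for illustration, not values from this paper.

```latex
\langle p_c \rangle_L = p_c^{\infty} - a\, L^{-1/\nu}, \qquad
\Delta p_c(L) = b\, L^{-1/\nu}, \qquad \nu = 4/3 .
% Doubling the lattice size, e.g. from L = 20 to L = 40, rescales both
% the shift and the width by the same factor:
\frac{\Delta p_c(40)}{\Delta p_c(20)} = 2^{-1/\nu} = 2^{-3/4} \approx 0.59 .
```

Under these assumptions the distribution narrows by only about 40% each time L is doubled.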

For our purpose, which is to relate percolation to the
prediction of a rupture or an earthquake, we interpret p
as the running time, which is also the fraction of the
lattice which is damaged. We thus envision the
two-dimensional lattice as being progressively damaged
at a rate of one site failing per unit time. The
percolation threshold p_c(Cl)
then corresponds to the time when there is a connected
path of damaged sites running from one side to the
other, such that the system is disconnected into at least
two pieces, a diagnostic of rupture. Hence, the
progressive filling of the sites in the percolation problem
described above corresponds to the progressive damage
of an initially pristine system.
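
The filling procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `percolation_threshold`, the union-find bookkeeping, and the choice to test only left-to-right spanning with nearest-neighbour connectivity are all assumptions made here.

```python
import random

def percolation_threshold(L, rng):
    """Damage an L x L lattice one random site at a time and return the
    fraction p at which damaged sites first connect the left column to the
    right column (assumed spanning criterion)."""
    n = L * L
    left, right = n, n + 1            # two virtual nodes, one per lattice edge
    parent = list(range(n + 2))       # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    occupied = [False] * n
    order = list(range(n))
    rng.shuffle(order)                # random order in which sites fail
    for k, site in enumerate(order, start=1):
        occupied[site] = True
        r, c = divmod(site, L)
        if c == 0:
            union(site, left)
        if c == L - 1:
            union(site, right)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < L and 0 <= cc < L and occupied[rr * L + cc]:
                union(site, rr * L + cc)
        if find(left) == find(right):
            return k / n              # "rupture time" of this realization
    return 1.0                        # not reached: a full lattice always spans
```

Calling the function repeatedly with differently seeded `random.Random` instances gives a sample of thresholds from which the distribution of rupture times can be histogrammed.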

Roux et al. [10] have shown that rupture is equivalent
to percolation in the limit of very large disorder
and, by extension, rupture processes can be considered
as nothing but (complicated) correlated percolation
problems [11, 12]. Since, by definition, the addition
of new sites in the percolation model involves no
interaction, correlation or memory of the past, the formulation
of the idea inspired by the RTP method in this context
necessarily reduces the scope of the approach. This
is because the information present in real rupture and
earthquake cases based on correlation and memory in
the time domain has no bearing on the prediction of the
percolation threshold p_c(Cl). In subsequent papers,
we will investigate different examples of "correlated" 
percolation, namely models of rupture, in which time- 
dependent precursors can be coupled with the spatial 
organization of damage. 

III. PREDICTIONS OF THE PERCOLATION 
THRESHOLD 

A. A hierarchy of prediction levels 

Suppose that a given realization in a system of size 
L x L is at the cumulative fraction p of damaged sites. 
What level of prediction is possible for its percolation
threshold p_c(Cl)? We now describe different levels
of prediction of the percolation threshold based on
increasing the available information.

1. The first level of prediction is what we call the
unconditional prediction, which amounts to not
even using the knowledge that the system has the
cumulative fraction p of damaged sites. It
corresponds to the statistical distribution of p_c(Cl).



This is the information available at the beginning 
of a simulation. 

2. The second level of prediction is to predict
p_c(Cl) conditioned on the knowledge that the
system has reached the cumulative fraction p of
damaged sites and has not yet percolated. This
obviously improves on the first level: for
instance, if by luck p happens to be already quite
large (say larger than the average of p_c(Cl)) and
the system is still not percolating, then we know
for sure that the value of p_c(Cl) for this system
will be larger than p.

3. The third level of prediction incorporates additional
information on how the damage over the
pL^2 sites is organized. For instance, typical
experiments of rupture have access to the spatial
organization of acoustic emissions, which provide
clues on the localization of damage. In this spirit,
suppose that we can measure the fraction of damaged
sites belonging to the largest cluster at p or
the size along the horizontal and vertical directions
of the largest cluster. Then, this should give
us some additional information to improve on the
prediction. Indeed, if we measure for two given
realizations that the largest cluster has a horizontal
size close to L in the first one and L/2 in the
second one for a given p, we can guess that the
first system will in general percolate sooner (for
a smaller p_c(Cl)) than the second system.

4. One can imagine many other levels of prediction 
using all kinds of additional information, such 
as the statistics of the clusters, their shape, posi- 
tions, etc. 

5. The ultimate level of prediction is to use all
the information on the exact locations of all damaged
sites and condition the prediction of p_c(Cl)
on this knowledge.

In the following, we implement the first three lev- 
els of predictions and show that we obtain substantial 
gains at the third level. This is perhaps not surprising,
but it provides a quantitative demonstration of how
prediction can be improved by using information on the
spatial organization of damage. Additionally, it tells us
what the limits of predictability are, given each level of
information.



B. First and second prediction levels 

The first prediction level described in section III A
amounts to constructing the standard probability
distribution function (PDF) P_L(p_c) of the percolation
thresholds, shown by the circles in Figure 1 for L = 20.
We have used 50 million realizations to get good
statistics. Such a distribution is the standard tool for
the study of finite-size scaling [9]. For our purpose,
it quantifies the range of predictions for the percolation
thresholds p_c(Cl) in the form of a probabilistic
forecast.

Crosses, dots and squares show the second prediction
level, corresponding to the PDFs P_L(p_c | p) conditioned
on those systems which have not percolated at
a fixed occupation density p = 0.50, p = 0.53 and p = 0.55,
respectively. Since, for L = 20, the unconditional PDF
P_L(p_c) is quite broad, with as many as 40% of the
realizations percolating with p_c(Cl) < 0.55, the
condition that p_c(Cl) has to be larger than 0.55 transforms
P_L(p_c) into a significantly more peaked conditional
PDF P_L(p_c | p = 0.55). In the language of the
prediction problem, the PDFs P_L(p_c | p) shown with the
crosses, dots and squares provide the probabilistic
forecasts for rupture, available at "time" p and conditioned
only on the knowledge of p.
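
The second-level forecast amounts to discarding the realizations that have already percolated at density p and renormalizing the histogram of the survivors. A minimal sketch, assuming a sample of thresholds is already available; the function name `conditional_pdf` and the equal-width binning are illustrative choices, not taken from the paper.

```python
def conditional_pdf(thresholds, p, n_bins=20, lo=0.0, hi=1.0):
    """Histogram estimate of the PDF of p_c restricted to realizations that
    have not percolated at density p (i.e. with p_c > p).

    Returns (density, surviving_fraction); density integrates to 1 over
    [lo, hi] whenever at least one realization survives."""
    survivors = [pc for pc in thresholds if pc > p]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for pc in survivors:
        k = min(int((pc - lo) / width), n_bins - 1)   # clamp pc == hi into last bin
        counts[k] += 1
    total = len(survivors)
    density = [c / (total * width) if total else 0.0 for c in counts]
    return density, total / len(thresholds)
```

The second return value is the fraction of realizations still intact at p, which is the weight removed from the unconditional distribution by the conditioning.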



C. Third prediction level 

We implement the third prediction level described
in section III A in two ways. Let us call p_ξ(p) the
fraction of sites belonging to the largest cluster and ξ(p)
the largest of the linear sizes projected on the x and y
axes of the largest cluster within the system when the
occupancy density is p.
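
The two observables can be measured on a damaged lattice with a standard cluster search. The sketch below is illustrative only: the name `largest_cluster_stats`, the nearest-neighbour connectivity, the normalization of the cluster mass by all L x L sites, and the first-found tie-break are assumptions made here.

```python
from collections import deque

def largest_cluster_stats(grid):
    """grid is a square list of rows of 0/1 (1 = damaged site).

    Returns (mass_fraction, extent): the fraction of all lattice sites that
    belong to the largest cluster, and the larger of that cluster's
    projected sizes on the x and y axes, in lattice units."""
    L = len(grid)
    seen = [[False] * L for _ in range(L)]
    best_size, best_extent = 0, 0
    for r0 in range(L):
        for c0 in range(L):
            if not grid[r0][c0] or seen[r0][c0]:
                continue
            # breadth-first search over one connected cluster
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            size = 0
            rmin = rmax = r0
            cmin = cmax = c0
            while queue:
                r, c = queue.popleft()
                size += 1
                rmin, rmax = min(rmin, r), max(rmax, r)
                cmin, cmax = min(cmin, c), max(cmax, c)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < L and 0 <= cc < L and grid[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            if size > best_size:      # ties keep the first cluster found
                best_size = size
                best_extent = max(rmax - rmin + 1, cmax - cmin + 1)
    return best_size / (L * L), best_extent
```

Dividing the returned extent by L gives the normalized size used when a condition such as "extent equal to 0.2L" is imposed.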

Figure 2 presents the PDF P_L(p_c | p, p_ξ) conditioned
on both p and p^—e%, for different values of p (crosses: 
p = 0.4, dots: p = 0.45, squares: p = 0.5). For
comparison, the unconditional distribution of the first
prediction level is also shown with open circles. The
gradual shift of the PDF P_L(p_c | p, p_ξ) to larger values
of p_c for increasing p shows that, when the size of
the largest cluster is held fixed, percolation is more
likely to occur at "late times" (i.e. for large p_c) the
larger the value of p. Intuitively, this just means that if
one observes in two different systems, at different values
of p, the same concentration of the largest cluster,
the system with the larger value of p is more likely to
percolate at a later time. The shift and narrowing of the
PDFs are a clear illustration of the information one can
gain by conditioning on relevant variables.

Figure 3 presents the PDF P_L(p_c | p, ξ) conditioned
on both p and ξ = 0.2L for different values of p (dots:
p = 0.35, squares: p = 0.4). For comparison, the
unconditional distribution of the first prediction level
is also shown with open circles. The results are similar
to those presented in Figure 2, with a gradual shift and
narrowing of the conditional PDFs to larger values of
p_c for increasing p. Using the largest projected cluster
size should give even more information on the final value of
p_c for a given system since, e.g., for two systems with the
same p and p_ξ, the one having the more elongated largest
cluster should reach the
percolation threshold sooner on average.

Figure 4 shows, as in Figure 3, the PDF P_L(p_c | p, ξ)
for a fixed p = 40% and different values of ξ: ξ/L =
0.04 (crosses), ξ/L = 0.06 (dots), ξ/L = 0.08
(squares) and ξ/L = 0.1 (triangles). The open circles
represent the unconditional PDF P_L(p_c) for comparison.



IV. MEASURES OF GOODNESS OF THE THIRD 
LEVEL PREDICTIONS 

A. Information gain 

A standard measure of the improvement in the
quality of forecasts when going from the first to the
third prediction level is the information gain H - H_sc,
where H is the unconditional entropy defined by

H = -\int P(p_c) \ln P(p_c) \, dp_c .   (1)

We consider two possible conditional entropies
H_sc(p, p_ξ) and H_sc(p, ξ) associated with the two
conditional schemes of the third level prediction discussed
in the previous section:

H_{sc}(p, p_\xi \text{ or } \xi) = -\int P(p_c \mid p, p_\xi \text{ or } \xi) \ln P(p_c \mid p, p_\xi \text{ or } \xi) \, dp_c .   (2)

The relative "specific information gain" I(p, p_ξ or ξ) is
then defined by

I(p, p_\xi \text{ or } \xi) = \frac{1}{H} \left( H - H_{sc}(p, p_\xi \text{ or } \xi) \right) .   (3)
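
A numerical version of the information gain can be sketched from binned distributions. Two assumptions are made here and are not confirmed by the text: the entropies are evaluated as discrete Shannon entropies of bin probabilities (so that H is non-negative), and the "relative" gain is taken to be the gain H - H_sc divided by H. The function names are illustrative.

```python
import math

def shannon_entropy(probs):
    """Discrete entropy -sum q ln q of a list of bin probabilities."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def relative_information_gain(uncond_probs, cond_probs):
    """(H - H_sc) / H, with H from the unconditional bin probabilities and
    H_sc from the conditional ones (assumed normalization)."""
    h = shannon_entropy(uncond_probs)
    return (h - shannon_entropy(cond_probs)) / h
```

With this convention the gain is 0 when conditioning changes nothing, approaches 1 when the conditional distribution collapses onto a single bin, and is negative when the conditional distribution is broader than the unconditional one.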

Figure 5 shows I(p, p_ξ) (panel a) and I(p, ξ) (panel
b) as a function of p for various values of p_ξ and ξ. The
relative specific information gains I(p, p_ξ) and I(p, ξ)
have qualitatively the same behavior, characterized by
three regimes.

1. For small values of p (the smaller p_ξ or ξ, the
smaller the values of p for which this regime
holds), we observe some information gain when
adding the information on p_ξ or ξ. This information
gain can be ascribed to the realizations
which initially (i.e. for small p) have an abnormally
large value of p_ξ or ξ, and therefore are
likely to percolate earlier than typical realizations.
The knowledge of these anomalously large p_ξ or
ξ, when they occur, improves the
prediction of the percolation of these systems.
Translated into the context of the prediction of
rupture, the information gain shown in Figure 5 for
small p is based on the detection of anomalous
cracks or defects at an early stage. It is important
to stress that the information gain is not uniform
over all realizations: most realizations are not
much more predictable by adding the information
on p_ξ (or ξ) for small p; only those which
have anomalous defects can be better predicted.
This result is reasonable and recovers the standard
approach in the applications of mechanical
engineering to the prediction of rupture, in which
the major efforts are put into the detection of
possible initial flaws in the material or structure.

2. For intermediate values of p, the information
gain obtained by conditioning on p_ξ or ξ is limited
if not negative, since for these values of p the
imposed p_ξ or ξ correspond to "normal" values.

3. Finally, for the larger values of p, the information
gain accelerates and becomes large, since it becomes
very unlikely to observe systems with such small
values of p_ξ or ξ. Therefore, the knowledge that
a given realization has an anomalously small p_ξ
or ξ provides highly meaningful information:
percolation will require a much larger value
of p than the current value.

While the relative specific information gains I(p, p_ξ)
and I(p, ξ) have qualitatively the same behavior, the
gain is much larger for the latter than for the former:
this is because the geometrical size of the largest
cluster is much more relevant for percolation than the
total number of sites in the largest cluster.



B. RMS of prediction errors 

We now quantify the errors of the prediction of the
realization-specific percolation threshold p_c(Cl) based
on the conditioning on p and p_ξ or ξ. We imagine a
situation mimicking real life, in which one
monitors the cumulative level of damage p of a sample
as well as the largest crack in the system. Conditioned
on the knowledge of p and p_ξ or ξ for a given
realization, how well can we predict the rupture time p_c(Cl)
of the sample?

In order to address this question, we have first generated
50 million realizations of systems of size L = 20 to
obtain a good estimate of the conditional distributions
P(p_c | p, p_ξ) and P(p_c | p, ξ), which will be our
prediction tools. Having sampled these conditional
distributions, we then constructed additional realizations
that we monitored to measure their p_ξ(p) and
ξ(p) as a function of p. For a given realization at
a given p, knowing the corresponding specific p_ξ(p),
our prediction is nothing but P(p_c | p, p_ξ(p)). Similarly,
for a given realization at a given p, knowing
the corresponding specific ξ(p), our prediction is nothing
but P(p_c | p, ξ(p)). Note that our forecasts are
intrinsically probabilistic, by construction. However,
each probabilistic forecast can be translated into a single
predicted number p_c^predicted(p), for instance the
median of P(p_c | p, p_ξ(p)) or P(p_c | p, ξ(p)),
complemented with an uncertainty given by some measure of
the width of these distributions (standard deviation or
quantiles).

In order to assess the quality of such predictions, we
need to construct statistics over ensembles of forecasts.
In addition, we would like to study how the quality of
the predictions evolves with the degree of damage p, in
particular to test whether we get advance warning and how
the prediction improves or deteriorates as a function
of p. Since, for each p, we have two distributions (of
p_ξ and ξ) which move with p, the amount of data to
visualize is too large to remain comprehensible. We
propose to focus on fixed quantiles q of the distributions of
p_ξ and ξ, say q = 5% and q = 95%, so that we issue
predictions based on the pairs p, p_ξ^q(p) (and similarly
p, ξ^q(p)), where p_ξ^q(p) (resp. ξ^q(p)) is the q-th quantile
of the distribution of p_ξ (resp. ξ) for the cumulative
damage p.

For such a prediction, we can assess its error by
constructing the RMS (root-mean-square) of errors

Q(p) = \left\langle \left( p_c^{\rm predicted}(p) - p_c^{\rm true} \right)^2 \right\rangle^{1/2} ,   (4)

where p_c^true is the true value observed for the given
system and where p_c^predicted(p) is our predicted value of p_c
for a given system and for a given p, using a given
quantile q of the distribution of p_ξ (resp. ξ) for the
cumulative damage p. As our prediction p_c^predicted(p) for
p_c, we have used the median value of the conditional
cumulative distribution, defined by

P_<\left( p_c^{\rm predicted} \mid p, \, p_\xi^q(p) \text{ or } \xi^q(p) \right) = 1/2 .   (5)
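
The median predictor and the RMS error can be sketched as follows, assuming the conditional distribution is represented by a finite sample of thresholds from realizations matching the observed condition. The function names are illustrative, and the average in the RMS is taken over the test realizations supplied.

```python
import math

def median_prediction(conditional_samples):
    """Median of a sample of p_c values drawn under the observed condition,
    i.e. the point where the empirical cumulative distribution reaches 1/2."""
    s = sorted(conditional_samples)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])

def rms_error(predicted, true_values):
    """Root-mean-square difference between predicted and realized thresholds."""
    sq = [(a - b) ** 2 for a, b in zip(predicted, true_values)]
    return math.sqrt(sum(sq) / len(sq))
```

For example, predicting 0.6 for two test realizations whose true thresholds turn out to be 0.5 and 0.7 gives an RMS error of 0.1.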

Figure 6 shows Q(p, ξ) for q = 5% and q = 95%,
when using P(p_c | p, ξ^q(p)) as the predictor, as a
function of p. The triangles correspond to q = 5%, the
dots to q = 95%, the crosses to q = 50%, while
the circles show Q(p) obtained by conditioning
only on p, for comparison. The corresponding figure
when using P(p_c | p, p_ξ^q(p)) as the predictor is very
similar and is thus not shown. Figure 7 shows the gain
in RMS Q(p) - Q(p, ξ) when adding the information
on ξ. These figures show the result of an implementation
which mimics a real experiment on a material
progressively brought to failure: one would, at a given
time (that is, p), measure the largest crack and, from
the PDFs documented from earlier experiments, get
an estimate of p_c corresponding to that p and ξ. Notice
that all three estimates in Figure 6 coincide for small
p. The reason is of course that the PDFs for small
p are very close to each other, whether the conditioning
is on ξ (corresponding to a small value of p) or just
on p itself.

These figures confirm the significant gain in prediction
accuracy when conditioning the forecast on the
q = 5% and q = 95% quantiles of the distribution
of ξ (resp. p_ξ). The q = 5% quantile selects those
realizations whose largest cluster is so small
that 95% of the realizations have a bigger largest
cluster. Conditioning on this information gives a significant
gain in the forecast, especially for advance warnings.
The improvement deteriorates when p approaches the
average percolation threshold and even changes sign,
with a worse quality for p larger than about 54%. We
observe the opposite trend when conditioning on the
q = 95% quantile of the distribution of ξ, corresponding
to those realizations which have an anomalously
big largest cluster, so that only 5% of the realizations
have a bigger largest cluster. In this case, the
prediction accuracy is improved for p > 0.45.



V. HIERARCHICAL FIBER RUPTURE MODEL 
WITH TIME-DEPENDENCE 

The principles underlying the results on the percolation
model presented above are of general validity. Our
subsequent papers will investigate their application and
extension to other model systems and to different real
systems including concrete engineering systems (material
failure, structural collapse) and geophysical systems
(earthquakes, landslides). However, it is worthwhile
already to present preliminary results obtained
on a more realistic (albeit still highly simplified) model
of damage evolution and rupture, to illustrate our point.



A. Definition of the hierarchical bundle model 

The model describes the time evolution of damage
leading to the culminating global failure of a bundle of
fibers in a creep experiment. The model has been studied
in [13, 14, 15]. Consider a hierarchical bundle of
elastic fibers subjected to a constant stress load σ per
fiber applied at time t = 0. The topology of the system
is as follows. Each fiber is associated with another
fiber in a pair. Then, two neighboring pairs are associated
with each other, forming a pair of two pairs, and
so on iteratively up in a sequence of levels, thus defining
a discrete hierarchical tree of local coordination 2.
A system containing n such levels has 2^n fibers. This
topology impacts the dynamics of fiber rupture in the
following way. When one of the two fibers of a given
pair fails, its stress load is transferred instantaneously to
the surviving fiber, such that its load is doubled. When
this fiber breaks, its load is transferred to the pair of
fibers associated with it if this second pair is still present.
Otherwise, it is transferred to the pair of two pairs linked
at the next hierarchical level. The last ingredient of the
model is to specify how a fiber fails under a given stress
load history. Given some stress history σ(t'), t' > 0, a
fiber is assumed to break at some fixed random time,
where the probability that this random time takes a
specific value t is specified by its cumulative distribution
function

P_0(t) = \int_0^t p_0(t') \, dt' = 1 - \exp\left\{ -\kappa \int_0^t [\sigma(t')]^{\rho} \, dt' \right\} .   (6)

This law captures the physics of stress corrosion and
of failure due to stress-assisted thermal activation and
progressive damage. A system of 2^n fibers is fully
specified by attributing to each fiber i = 1, ..., 2^n at
the beginning of the experiment a fixed failure time t_i
taken from the distribution (6). The failure time t_i is
by definition the time at which the fiber i would have
broken if the stress had stayed constant and equal to the
initial value σ. But the fibers are coupled through the
hierarchical load transfer rule defined above. As a
consequence of the hierarchical structure of the load
transfers occurring at each rupture, the stress applied to a
given fiber may increase, leading to a shortening of its
lifetime.

Let us consider quantitatively the effect of the rupture
of one fiber at time t_1 on the other fiber of its pair,
which would have broken at time t_2 without this
additional load transfer. For a population of such pairs of
fibers, the distribution of the time-to-failure for the
remaining fiber is obtained from (6) by taking the stress
equal to σ up to t_1 and equal to 2σ from t_1 up to the
second rupture, which now occurs at a time t_12 < t_2,
itself a function of t_1 and t_2:

P_0(t_{12}) = 1 - \exp\left\{ -\kappa \sigma^{\rho} \left[ t_1 + 2^{\rho} (t_{12} - t_1) \right] \right\} .   (7)

When this calculation is done for the ensemble, the
population of fibers must be the same, since the population is
homogeneous at this level; P_0(t_12) should therefore
also be equal to 1 - exp(-κ σ^ρ t_2). Considering that
t_12 is a function of t_2, and identifying this expression
with (7), we re-derive the fundamental result [13] that
the time-to-failure of a fiber is modified from its initial
value t_2 to a smaller failure time t_12 by the influence
of the other fiber which has failed at the earlier time t_1,
according to:

t_{12} = t_1 + 2^{-\rho} (t_2 - t_1) .   (8)

The inequality 2^(-ρ) < 1 (for ρ > 0) ensures that
t_1 < t_12 < t_2. This corresponds to a genuine cooperative
process, as the time of failure of the second fiber
is decreased by the load transfer from the first fiber.
This remarkable result holds for any realization of the
stochastic process. Let us stress that this result applies
not only at the level of individual fibers but at all
levels within the hierarchy: if t_1 and t_2 are the lifetimes
of two uncoupled bundles, then (8) describes the effect
of the rupture of the first bundle on the second one,
which sees its load doubling at time t_1. The relation
(8) forms the basis for analytical calculations as well as
numerical simulations. In particular, an exact Monte Carlo
calculation of the probability distribution of failure times
of this hierarchical system indicates that the distribution
of failure times for the whole system is renormalized
from P_0(t) into a staircase (it jumps from 0 to
1) at a well-defined non-zero critical time t*, as the
system size n tends to infinity, according to a generalized
central limit theorem. It has also been shown
theoretically and numerically that the rate of fiber
failures diverges (up to finite-size effects) according to a
power law ~ 1/(t* - t)^γ(ρ) upon the approach to the
global rupture time t* for ρ > 1, where γ depends on ρ
[14, 15]. In our investigation below, we take κ = 1 and
ρ = 2.
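
Taking the statement above at face value, that relation (8) applies at every level of the hierarchy, the lifetime of a whole system can be sketched by collapsing the tree pairwise. This is an illustrative sketch under that assumption, not the authors' simulation code; the function names are hypothetical. Under the constant initial stress σ, law (6) reduces to an exponential distribution with rate κ σ^ρ, which is what the sampler uses.

```python
import random

def bundle_lifetime(lifetimes, rho=2.0):
    """Collapse 2**n uncoupled fiber lifetimes into the global failure time
    by applying t12 = t1 + 2**(-rho) * (t2 - t1) to neighbouring pairs,
    level by level, up the hierarchical tree."""
    level = list(lifetimes)
    if not level or len(level) & (len(level) - 1):
        raise ValueError("number of fibers must be a power of two")
    while len(level) > 1:
        merged = []
        for a, b in zip(level[0::2], level[1::2]):
            t1, t2 = min(a, b), max(a, b)     # t1 fails first and loads the other
            merged.append(t1 + 2.0 ** (-rho) * (t2 - t1))
        level = merged
    return level[0]

def sample_fiber_lifetimes(n_levels, sigma=1.0, kappa=1.0, rho=2.0, rng=None):
    """Draw 2**n_levels independent lifetimes at constant stress sigma:
    exponential with rate kappa * sigma**rho."""
    rng = rng or random.Random()
    rate = kappa * sigma ** rho
    return [rng.expovariate(rate) for _ in range(2 ** n_levels)]
```

For instance, with ρ = 2 a pair with uncoupled lifetimes 1 and 3 fails completely at time 1.5, and repeating `bundle_lifetime(sample_fiber_lifetimes(n))` over many seeds gives an empirical distribution of global failure times.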

B. Third level prediction by conditioning the 
distribution of lifetimes on the observation of the large 
crack 

We address the central question of this paper, 
namely, how the revelation of information up to the 
present in the form of the partial knowledge of where 
and when fibers or groups of fibers have broken may 
be exploited to bracket better and better the realization- 
specific lifetime of a whole given system. 

In order to mimic a real-life situation, we consider a
creep experiment on our hierarchical fiber system such
that, at time 0, a stress σ is applied. We have no
access to the specific individual lifetimes of the individual
constituent fibers, only to their PDF p_0(x). As time
passes, damage occurs, that is, fibers break, thus
revealing their initial lifetimes. The situation becomes of
course complicated because of the interactions between
the fibers through the hierarchical stress-load transfer
defining the model, as the damage spreads across the levels of
the hierarchy. In a real-life experiment, the damage
would be measured for instance by acoustic emissions,
with both time and space localization giving information
on which fibers have broken and at what time.

Our goal here is to construct schemes that use some
information in space and time on the damage that has
occurred until time t to form a better prediction for the
rupture of the next level of the hierarchy and for the
whole system, in the form of a PDF of lifetimes for the
total system.

Figure 8 gives an illustration of the space-time evolution
of fiber damage for a system of 2^8 = 256 fibers.
One can observe a transition from initial random
uncorrelated ruptures to a progressive organization with
growth of "cracks" and fusion between "cracks"
associated with the acceleration of damage up to the
culminating global failure.

Now, suppose that we observe the evolution of such
a system from time 0 to some "present" time t, before
complete failure. Furthermore, suppose that our
measurement is imperfect and we do not have access to all
the information on the positions and times of individual
fiber failures. Let us assume that we only know the
size 2^m of the largest crack (or bundle) that has broken
up to time t and some additional information on the
fibers that broke within this crack at earlier times. Is
this knowledge useful? Figure 9 shows two different
measures of the cumulative number of broken fibers as
a function of t (in log-log scales) for a given realization.
The thick curve shows the unconditional cumulative
number of broken fibers. The thin curve shows,
as a function of time t, the cumulative number of broken
fibers which broke either within the largest crack
or within its complement in their pair within the
hierarchy. It is worth emphasizing that the time evolution
of both measures of cumulative damage is knowable at each
time t. One can observe a striking difference, illustrating
vividly the impact of conditioning on some available
partial information on the on-going damage in order
to improve the prediction of the global failure: in the
absence of conditioning (we count all broken fibers),
one can observe mostly a linear increase and, only at
the very end, can one see an acceleration (which is a
power law of 1/(t_c - t), as shown in the inset); in
contrast, with the conditioning on the largest crack and its
complement, the power law regime is extended to very
early times.

This result cannot be stressed sufficiently: in the
past two decades, the failure of heterogeneous
materials has been shown to belong to the class of
dynamic critical phenomena (see for instance the review
[12] and references therein), but the critical region is in
general difficult to observe and rather reduced in
practical situations, thus hindering applications (this is
why other techniques have been developed to enhance
the predictability by extending the region over which
critical information can be extracted [16, 17]). What is
remarkable in Figure 9 is that focusing on the largest
current crack and its neighborhood enhances the
critical region tremendously, thus offering a large potential
for prediction at early times.

Figure [10] is the equivalent for the hierarchical
rupture model of figure [6] previously constructed for
the percolation model. It shows the root-mean-square
(rms) of the error, i.e. the difference between the
predicted global rupture time and the true realized
one, for 5 distinct prediction schemes using different
conditioning. The improvement due to conditioning is
qualitatively similar to, but quantitatively stronger
than, that found for the percolation model. This can be
expected since the hierarchical bundle model has a
dynamics in which the failure times of fibers keep the
memory of past ruptures: the failure of a fiber is a
function of all the previous ruptures that impacted the
load history on this fiber. In contrast, the rupture of
a bond in the percolation model is completely
independent of past damage (except for the fact that
the rupture occurs on remaining intact bonds, which is
the mechanism underlying the benefits of conditioning
exploited in previous sections). Memory is thus
expected to improve the prediction performance, and one
can verify that it does: we conjecture more generally
that the larger the connectivity and interactions
between elements, the greater the improvement of
prediction quality obtained by conditioning upon new
information.
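
The benefit of conditioning on the mere fact of survival can be
illustrated with a minimal sketch for the percolation case. The code
below is not the implementation used in this paper. It assumes that
p_c is the occupied fraction at which a cluster first connects the top
row to the bottom row of a lattice with free boundaries, and that the
prediction error is measured as the rms deviation of p_c from its
conditional mean among the realizations that have not yet percolated at
occupancy p. The function names and the number of realizations are
hypothetical choices made for illustration.

```python
import random
import statistics


def percolation_threshold(L, rng):
    """Occupy sites of an L x L lattice in random order and return the
    occupied fraction at which a cluster first links top and bottom rows.
    (Assumed spanning criterion; the paper's criterion may differ.)"""
    n = L * L
    top, bottom = n, n + 1              # virtual nodes for the two edges
    parent = list(range(n + 2))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    occupied = [False] * n
    order = list(range(n))
    rng.shuffle(order)
    for count, site in enumerate(order, start=1):
        occupied[site] = True
        row, col = divmod(site, L)
        if row == 0:
            union(site, top)
        if row == L - 1:
            union(site, bottom)
        # merge with occupied nearest neighbors
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + dr, col + dc
            if 0 <= r < L and 0 <= c < L and occupied[r * L + c]:
                union(site, r * L + c)
        if find(top) == find(bottom):
            return count / n
    return 1.0  # not reached: a fully occupied lattice always spans


def rms_error_given_survival(thresholds, p):
    """RMS error of predicting p_c by its conditional mean, restricted to
    realizations with p_c > p. Returns (rms, number of survivors)."""
    survivors = [pc for pc in thresholds if pc > p]
    if not survivors:
        return float("nan"), 0
    return statistics.pstdev(survivors), len(survivors)


rng = random.Random(0)                  # fixed seed, arbitrary choice
thresholds = [percolation_threshold(20, rng) for _ in range(1000)]
print("unconditional rms:", statistics.pstdev(thresholds))
for p in (0.50, 0.55, 0.60):
    rms, n_survivors = rms_error_given_survival(thresholds, p)
    print(f"p = {p:.2f}: rms = {rms:.4f} over {n_survivors} survivors")
```

Under these assumptions, one would expect the conditional rms to shrink
as p grows, since the lower tail of the threshold distribution is
progressively excluded. Conditioning on the largest cluster would
require tracking cluster sizes in the union-find structure as well.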

VI. CONCLUDING REMARKS 

Our goal has been to demonstrate that one can
predict the percolation or rupture threshold, based on
the knowledge of the amount of current damage and on
some information on the largest cluster or crack in the
system. This problem was inspired by the idea of
constructing better predictors for earthquakes and
ruptures based on a combination of the space and time
organization of damage. In this paper, which is the
first of a series, we have first considered perhaps the
worst and most difficult case for prediction, namely
percolation, because in this model damage has no memory
of the past and no space-time correlations exist other
than the properties associated with the geometry of
connectivity. Similar results, not shown here, have
been obtained for other lattice sizes, L = 10, 30, 40
and 50.

Then, we have illustrated the robustness of the
results presented for the percolation model on one of
the simplest models of time-dependent rupture, a
hierarchical fiber bundle model. We have shown that
conditioning the measures of damage on the location and
size of the largest crack significantly extends the
critical region and improves the prediction skill.

We will show in subsequent papers that the
predictions obtained in more realistic models of
rupture, which include realistic correlations in the
space-time organization of damage and of cracks, are
significantly better still. But our goal has been
reached here by showing that, in the worst possible and
most difficult case for prediction, we can achieve
significant gains by conditioning on some information
on the spatial organization of damage. In our practical
implementation, we have considered the simplest
information, and many other algorithms can be developed
to improve on our results. This will be developed in
future papers.

FIG. 1: Circles: standard probability distribution
function (PDF) $P_L(p_c)$ as a function of the percolation
threshold $p_c$ (in percent) for $L = 20$. Crosses:
conditional PDF $P_L(p_c \mid p = 0.5)$ conditioned on those
systems which have not percolated for a fixed occupation
density $p = 0.5$. Dots: conditional PDF
$P_L(p_c \mid p = 0.53)$ conditioned on those systems which
have not percolated for a fixed occupation density
$p = 0.53$. Squares: conditional PDF $P_L(p_c \mid p = 0.55)$
conditioned on those systems which have not percolated for
a fixed occupation density $p = 0.55$.



FIG. 2: PDF $P_L(p_c \mid p, p_\xi)$ as a function of $p_c$
(in percent) conditioned on both $p$ and $p_\xi = 6\%$, where
$p_\xi$ is the fraction of sites belonging to the largest
cluster, for different values of $p$ (crosses: $p = 0.4$,
dots: $p = 0.45$, squares: $p = 0.5$). For comparison, the
unconditional distribution of the first prediction level is
also shown with open circles.

FIG. 3: PDF $P_L(p_c \mid p, \ell)$ as a function of $p_c$
(in percent) conditioned on both $p$ and $\ell = 0.2L$, where
$\ell$ is the larger of the two linear sizes, projected on
the x and y axes, of the largest cluster within the system,
for different values of $p$ (dots: $p = 0.35$, squares:
$p = 0.4$). For comparison, the unconditional distribution of
the first prediction level is also shown with open circles.

FIG. 4: Same as Figure [3] for a fixed $p = 40\%$ and
different values of $\ell$: $\ell/L = 0.04$ (crosses),
$\ell/L = 0.06$ (dots), $\ell/L = 0.08$ (squares) and
$\ell/L = 0.1$ (triangles). The open circles represent the
unconditional PDF $P_L(p_c)$ for reference.




FIG. 5: Panel a): Relative specific information gain
$I(p, p_\xi)$ as a function of $p$ (in percent) for various
values of $p_\xi$ = 2%, 4%, 6%, 8%, 10% from left to right.
Panel b): Relative specific information gain $I(p, \ell)$ as
a function of $p$ for various values of $\ell/L$ = 10%, 20%,
30%, 40%, 50%, 60% from left to right.

FIG. 6: RMS $Q(p, p_\xi)$ (in percent) of the prediction
errors defined by (4) for the quantiles $q = 5\%$ and
$q = 95\%$ of the distribution of $\ell$ at fixed $p$, when
using $P(p_c \mid p, \ell_q(p))$ as the predictor, as a
function of the damage parameter $p$ (in percent). Triangles:
$q = 5\%$; dots: $q = 95\%$; crosses: $q = 50\%$; circles:
$Q(p)$ obtained using the conditioning only on $p$. This RMS
$Q(p, p_\xi)$ should be compared with the standard deviation,
equal to 4.66%, of the unconditional distribution of
percolation thresholds, to illustrate the gain in prediction
accuracy deriving from the added information.

FIG. 7: Gain in RMS $Q(p) - Q(p, \ell)$ when adding the
information on $\ell$, where $Q(p, \ell)$ is shown as the
triangles ($q = 5\%$), dots ($q = 95\%$) and crosses
($q = 50\%$) and $Q(p)$ is shown in figure [6] with the
circles.

FIG. 8: A specific realization of the space-time evolution
of fiber damage for a system of $2^8 = 256$ fibers. The
fibers are numbered sequentially from 1 to 256 along the
vertical axis. When a given fiber $i$ breaks at some time
$t_i$, a symbol + represents the spatial position and failure
time of this event.

FIG. 9: (Top) Two different measures of the cumulative
number of broken fibers as a function of time $t$ for a given
realization. The thick curve shows the unconditional
cumulative number of broken fibers. The thin curve shows the
(conditional) cumulative number of broken fibers, which broke
either within the largest crack identified up to time $t$ or
within its complement in their pair within the hierarchy.
(Bottom) This graph shows the same two curves in log-log
scales with time $t$ replaced by $t_c - t$, where $t_c$ is
the global time of failure (only known at the end). This
log-log representation allows us to visualize the power law
acceleration characterizing the final critical regime before
complete rupture, which is much more apparent in the
conditional cumulative number of broken fibers.

FIG. 10: Root-mean-square (rms) $Q(t)$ of the error, i.e.
the difference between predictions made at time $t$ of the
global rupture time and the true realized one, as a function
of time $t$ for 5 distinct prediction schemes using different
conditioning, similarly to figure [6] previously constructed
for the percolation model. The system used here has $2^8$
fibers and p = 2. The circles correspond to a prediction at
time $t$ of the failure time $t_c$ based solely on the
information that the system has not yet broken. For the other
curves, we constructed the distribution of failure times over
$10^6$ realizations for the different conditioning. The
triangles correspond to the rms $Q(t)$ obtained by using the
5% quantile of the distribution of failure times over these
$10^6$ simulations. Specifically, for a given system, and at
a given time $t$, we measure the size $\ell$ of the largest
failed cluster and then read from the distribution of failure
times for the same time $t$ and same cluster size $\ell$ the
5% quantile that we take as the prediction for the failure
time. Similarly, the crosses and dots correspond respectively
to the 50% and 95% quantiles. Note that in our system of
$2^8$ fibers, there are 8 possible sizes of "cracks" larger
than 1, namely 2, 4, 8, ..., 128, 256. These curves are
obtained by averaging over 10 realizations. These rms $Q(t)$
for the five prediction schemes should be compared with the
standard deviation, equal to 0.0311, of the unconditional
distribution of failure times $t_c$, to illustrate the gain
in prediction accuracy deriving from the added information
obtained from conditioning.
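
The quantile-based prediction scheme described in this caption can be
sketched as a table lookup. The sketch below is an illustration under
stated assumptions, not the code used for the figure: the names
build_table and quantile_prediction, the nearest-rank quantile
convention, the assumption that observation times are discretized so
that they can serve as keys, and the toy samples are all hypothetical.

```python
import math
from collections import defaultdict


def build_table(samples):
    """Group failure times t_c by the conditioning pair (t, size).

    `samples` is an iterable of (t, size, t_c) triples, where `size` is the
    largest failed cluster observed at the (discretized) time t in a
    realization that eventually failed at time t_c."""
    table = defaultdict(list)
    for t, size, t_c in samples:
        table[(t, size)].append(t_c)
    for values in table.values():
        values.sort()
    return dict(table)


def quantile_prediction(table, t, size, q):
    """Nearest-rank empirical q-quantile of t_c given (t, size)."""
    values = table.get((t, size))
    if not values:
        raise KeyError(f"no realizations recorded for t={t}, size={size}")
    rank = max(1, math.ceil(q * len(values)))
    return values[rank - 1]


# Toy, made-up samples (not simulation results): (t, largest size, t_c).
toy_samples = [
    (0.5, 4, 1.1), (0.5, 4, 0.9), (0.5, 4, 1.2), (0.5, 4, 1.0),
    (0.5, 8, 0.7), (0.5, 8, 0.8),
]
table = build_table(toy_samples)
for q in (0.05, 0.50, 0.95):
    print(q, quantile_prediction(table, 0.5, 4, q))
```

In an actual study the table would be filled from the ensemble of
simulated realizations, and the rms error of each quantile predictor
would then be evaluated on independent realizations.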



